
Conversation

@SyMind SyMind commented Feb 9, 2026

No description provided.

Copilot AI review requested due to automatic review settings February 9, 2026 07:13
@gemini-code-assist

Summary of Changes

Hello @SyMind, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on optimizing the ConcatSource hashing process by introducing parallelism. By leveraging the rayon crate, the computation of individual child hashes is now distributed across multiple threads, which is expected to yield performance improvements, especially for sources composed of many children. The change also includes a new test to confirm that the parallel hashing remains deterministic and consistent.

Highlights

  • Parallel Hashing: The ConcatSource hashing mechanism has been refactored to utilize parallel processing for its children, significantly improving performance for complex source structures.
  • Dependency Update: The rayon crate, a library for data parallelism, has been added as a direct dependency to enable parallel execution.
  • Test Coverage: A new test case has been introduced to specifically verify the deterministic nature of the ConcatSource hashing when multiple children are involved, ensuring correctness after parallelization.


Changelog
  • Cargo.toml
    • Added rayon = "1.11.0" as a direct dependency.
    • Removed rayon from the criterion development dependencies section, promoting it to a top-level dependency.
  • src/concat_source.rs
    • Imported rayon::prelude::* and rustc_hash::FxHasher to support parallel hashing.
    • Modified the impl Hash for ConcatSource to iterate over children in parallel using par_iter().
    • Each child's hash is now computed independently using FxHasher and then combined sequentially into a final hash (see the sketch after this list).
    • Added a new test test_hash_is_deterministic_for_many_children to ensure the parallel hashing produces consistent results.
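
A minimal sketch of the scheme this changelog describes, for readers who want to see its shape without opening the diff. ConcatLike and Child are simplified stand-ins for the crate's real types, and their fields are assumptions made purely for illustration:

use std::hash::{Hash, Hasher};

use rayon::prelude::*;
use rustc_hash::FxHasher;

// Stand-in for a child source; the real type lives in src/concat_source.rs.
#[derive(Hash)]
struct Child(String);

// Stand-in for ConcatSource, holding its (already optimized) children.
struct ConcatLike {
    children: Vec<Child>,
}

impl Hash for ConcatLike {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash every child independently on rayon's thread pool.
        let child_hashes: Vec<u64> = self
            .children
            .par_iter()
            .map(|child| {
                let mut hasher = FxHasher::default();
                child.hash(&mut hasher);
                hasher.finish()
            })
            .collect();
        // Fold the per-child hashes into one value in child order, then feed
        // that value to the caller-provided hasher.
        let mut combined = FxHasher::default();
        for child_hash in child_hashes {
            child_hash.hash(&mut combined);
        }
        combined.finish().hash(state);
    }
}

Because the per-child hashes are collected into a Vec and folded in child order, the final value does not depend on how rayon schedules the work.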
Activity
  • No specific activity (comments, reviews, etc.) has been recorded for this pull request yet.


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request parallelizes the hashing of ConcatSource children to improve performance, using the rayon crate. The implementation computes hashes of children in parallel and then combines them. While this is a good performance optimization for sources with many children, I've pointed out a potential performance regression for sources with few children due to the overhead of parallelization. I've suggested introducing a threshold to switch between sequential and parallel hashing to address this.

Comment on lines +233 to +240
let child_hashes: Vec<u64> = children
    .par_iter()
    .map(|child| {
        let mut hasher = FxHasher::default();
        child.hash(&mut hasher);
        hasher.finish()
    })
    .collect();


Severity: medium

Using par_iter unconditionally can degrade performance for a ConcatSource with only a few children, because the overhead of dispatching work to the thread pool can outweigh the hashing itself; for those cases this change would be a regression.

Consider adding a threshold to switch between sequential and parallel hashing. For example, you could use iter() for a small number of children and par_iter() for a larger number.

To maintain hash consistency, the sequential path must use the same hashing logic as the parallel one (i.e., creating an FxHasher for each child and then combining the hashes). A possible implementation could look like this:

const PAR_HASH_THRESHOLD: usize = 16; // Should be benchmarked

let child_hashes: Vec<u64> = if children.len() < PAR_HASH_THRESHOLD {
    children
        .iter()
        .map(|child| {
            let mut hasher = FxHasher::default();
            child.hash(&mut hasher);
            hasher.finish()
        })
        .collect()
} else {
    children
        .par_iter()
        .map(|child| {
            let mut hasher = FxHasher::default();
            child.hash(&mut hasher);
            hasher.finish()
        })
        .collect()
};


codspeed-hq bot commented Feb 9, 2026

Merging this PR will not alter performance

✅ 11 untouched benchmarks
⏩ 6 skipped benchmarks [1]


Comparing compute-hash-by-rayon (27c3f66) with main (74beb79)

Open in CodSpeed

Footnotes

  1. 6 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.


Copilot AI left a comment


Pull request overview

This PR parallelizes ConcatSource’s Hash implementation by hashing each optimized child source concurrently and then combining those hashes, requiring rayon as a normal dependency.

Changes:

  • Add rayon and use par_iter() to compute per-child hashes in parallel for ConcatSource.
  • Combine per-child hashes into a single value that is fed into the caller-provided Hasher.
  • Add a new unit test intended to assert hash determinism for many children (a sketch of the property it guards follows this list).
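
The determinism noted in the last bullet rests on two facts: each child's hash is independent of scheduling, and the per-child hashes are combined in child order. The sketch below checks exactly that property in isolation; it uses plain Strings as stand-in children, and fx_hash_of and combine are illustrative helpers rather than code from the PR:

use std::hash::{Hash, Hasher};

use rayon::prelude::*;
use rustc_hash::FxHasher;

// Hash a single value with a fresh FxHasher.
fn fx_hash_of<T: Hash + ?Sized>(value: &T) -> u64 {
    let mut hasher = FxHasher::default();
    value.hash(&mut hasher);
    hasher.finish()
}

// Combine per-child hashes in order, mirroring the step the PR performs
// before writing into the caller's hasher.
fn combine(child_hashes: &[u64]) -> u64 {
    let mut combined = FxHasher::default();
    for child_hash in child_hashes {
        child_hash.hash(&mut combined);
    }
    combined.finish()
}

#[test]
fn parallel_hashing_matches_sequential_and_is_deterministic() {
    let children: Vec<String> = (0..1_000).map(|i| format!("chunk {i};\n")).collect();

    let parallel: Vec<u64> = children.par_iter().map(|c| fx_hash_of(c.as_str())).collect();
    let sequential: Vec<u64> = children.iter().map(|c| fx_hash_of(c.as_str())).collect();

    // Per-child hashes are scheduling-independent, and combining them in child
    // order keeps the final value stable across runs.
    assert_eq!(parallel, sequential);
    assert_eq!(combine(&parallel), combine(&sequential));
}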

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

  • src/concat_source.rs: Parallelizes ConcatSource hashing via Rayon and adds a determinism-oriented unit test.
  • Cargo.toml: Moves/adds rayon under [dependencies] to support use in library code.


Comment on lines +901 to +908
let mut hasher1 = FxHasher::default();
source.hash(&mut hasher1);
let hash1 = hasher1.finish();

let mut hasher2 = FxHasher::default();
source.hash(&mut hasher2);
let hash2 = hasher2.finish();


Copilot AI Feb 9, 2026


FxHasher::finish() is a method from the std::hash::Hasher trait, but Hasher isn’t imported in this test module. As written, the calls to hasher1.finish() / hasher2.finish() won’t compile unless std::hash::Hasher is brought into scope (or std::hash::Hasher::finish(&hasher) is used).
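
A minimal, self-contained illustration of the fix this comment asks for; `source` here is just a &str standing in for the value hashed in the real test, and the point is only that bringing std::hash::Hasher into scope lets finish() resolve on FxHasher:

use std::hash::{Hash, Hasher}; // `Hasher` provides `finish`
use rustc_hash::FxHasher;

fn hashes_twice_and_compares(source: &str) -> bool {
    let mut hasher1 = FxHasher::default();
    source.hash(&mut hasher1);
    let hash1 = hasher1.finish(); // resolves because `Hasher` is in scope

    let mut hasher2 = FxHasher::default();
    source.hash(&mut hasher2);
    let hash2 = hasher2.finish();

    hash1 == hash2
}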

Comment on lines +242 to +246
let mut combined = FxHasher::default();
for child_hash in child_hashes {
    child_hash.hash(&mut combined);
}
combined.finish().hash(state);

Copilot AI Feb 9, 2026


After computing child_hashes, the extra combined FxHasher pass adds another hashing round and a finalization before writing into state. You can hash the u64 values directly into state in order (after the parallel collection) to reduce work and simplify the implementation.

Suggested change

- let mut combined = FxHasher::default();
- for child_hash in child_hashes {
-     child_hash.hash(&mut combined);
- }
- combined.finish().hash(state);
+ for child_hash in child_hashes {
+     child_hash.hash(state);
+ }

